Abstract: We consider a wireless device-to-device (D2D) network
where the nodes have cached information from a library of
possible files. Inspired by the current trend in the standardization 
of the D2D mode for 4th generation wireless networks, we restrict 
to one-hop communication: each node places a request for a file in
the library, and downloads from some other node which has the 
requested file in its cache through a direct communication link, 
without going through a base station. We describe the physical 
layer communication through a simple "protocol-model", based 
on interference avoidance (independent set scheduling). For this 
network we define the outage-throughput tradeoff problem and 
characterize the optimal scaling laws for various regimes where 
both the number of nodes and the files in the library grow to 
infinity. 

I. Introduction 

Wireless data traffic is increasing dramatically, with a 6600%
increase predicted for the next five years. This is mainly due
to wireless video streaming. Traditional methods for increasing 
the area spectral efficiency, such as use of more spectrum and 
increase in the number of base stations, are either insufficient 
to provide a suitable capacity increase, or are too expensive. 
There is thus a great need to explore alternative transmission 
strategies. 

While live streaming is a negligible portion of the wireless 
video traffic, the bulk is represented by asynchronous video 
on demand, where users request video files from some library 
(e.g., the top 100 titles in Netflix or Amazon Prime) at arbitrary 
times. Therefore, trivial uncoded multi-casting (i.e., serving 
many users with a single downlink transmission) cannot be ex- 
ploited in this context. One of the most promising approaches 
is caching, i.e., storing popular content at, or close to, the 
users. As has been pointed out in [1], caching can be used in
lieu of backhaul for providing content to users; for example, 
messages (e.g., video files) can be delivered during off-peak 
hours to the caches while the files can be used during peak 
traffic hours. In this paper we will particularly concentrate on 
caching at mobile devices, which is enabled by the availability 
of tens and even hundreds of GByte of largely under-utilized 
storage space in smartphones, tablets, and laptops. 

Recently, a coded multicasting scheme exploiting caching
at the user nodes was proposed in [2]. In this scheme, a
combination of caching and coded multicast transmission from
a single base station is used in order to satisfy all users' requests
at the same time. The construction of the caches is combinatorial,
and changing even a single file in the library requires
a complete reconfiguration of the user caches. Therefore, the
approach is not yet practical. In this paper we focus on a quite
different alternative that involves random independent caching
at the user nodes and device-to-device (D2D) communication.
We restrict to one-hop communication, inspired by the current
trend in the standardization of a D2D mode for 4th generation
cellular systems [3].

A relevant and related work is given in [4], where multi-hop
D2D communication is considered under a distance-based
protocol transmission model. If the aggregate distributed
storage space in the network is larger than the total size of
all messages, then it can be guaranteed that all users can be
served by this network. Under the assumption of a Zipf request
distribution with parameter $\gamma_r$ (to be defined later), the authors
of [4] design a deterministic duplication caching scheme and a
multi-hop routing scheme that achieves order-optimal average
throughput.

Since we consider only single-hop communication, requir- 
ing that all users are actually served for any request is 
too constraining. Therefore, we generalize the problem by 
introducing the possibility of outages, i.e., that some request is 
not served. For the system defined in Section II we define the
outage-throughput region and obtain achievable scaling laws 
and upper bounds which are tight enough to characterize the 
constant of the leading term. Simulations agree very well with 
the scaling law leading constants. We also compare the D2D 
system under investigation with the performance of the coded 
multicast of [2] and with naive broadcasting from the cellular
base station (independent messages), which can be regarded 
as today's state of the art. 

A similar setting was investigated by [6], where only the
sum throughput was considered irrespective of user outage
probability. Furthermore, in [6] a heuristic random caching
policy according to another Zipf distribution with a possibly
different parameter $\gamma_c$ was considered. The results showed
that the optimal throughput occurred when $\gamma_c \neq \gamma_r$, but the
throughput order achieved by this heuristic random caching policy is

Notation: given two functions $f$ and $g$, we say that:
1) $f(n) = O(g(n))$ if there exist a constant $c$ and an integer $N$ such that $f(n) \le c g(n)$ for $n > N$;
2) $f(n) = o(g(n))$ if $\lim_{n \to \infty} f(n)/g(n) = 0$;
3) $f(n) = \Omega(g(n))$ if $g(n) = O(f(n))$;
4) $f(n) = \omega(g(n))$ if $g(n) = o(f(n))$;
5) $f(n) = \Theta(g(n))$ if $f(n) = O(g(n))$ and $g(n) = O(f(n))$.
Fig. 1. a) Grid network with $n = 49$ nodes (black circles) with minimum
separation $s = 1/\sqrt{n}$. b) An example of single-cell layout and the interference
avoidance TDMA scheme. In this figure, each square represents a cluster. The
gray squares represent the concurrent transmitting clusters. The red area is the
disk where the protocol model imposes no other concurrent transmission. $R$
is the worst case transmission range and $\Delta$ is the interference parameter. We
assume a common $R$ for all the transmitter-receiver pairs. In this particular
example, the TDMA parameter is $K = 9$.










Fig. 2. Qualitative representation of our system assumptions: each user caches 
an entire file, formed by an arbitrarily large number of chunks. Then, users 
place random requests of finite sequences of chunks from files of the library, 
of random duration and random initial points.



generally suboptimal. More importantly, the total sum through- 
put is not a sufficient characterization of the performance of 
a one-hop D2D caching network: in certain regimes of the 
number of users and file library size it can be shown that to
achieve a high throughput only a small portion of the users
should be served while leaving the majority of the users
in outage. In contrast, our outage-throughput tradeoff region
is able to capture the notion of fairness, since it focuses on 
the minimum per-user average throughput. 

The paper is organized as follows. Section II introduces the
network model and the precise problem formulation of the
throughput-outage trade-off in wireless D2D networks. Section III
presents the achievable throughput-outage trade-off.
The outer bound of this trade-off is discussed in Section IV.
We discuss our results in Section V.

II. Network Model and Problem Formulation 

We consider a dense network deployed over a unit-area
square and formed by $n$ nodes $\mathcal{U} = \{u_1, \ldots, u_n\}$ placed
on a regular grid with minimum node distance $1/\sqrt{n}$ (see
Fig. 1(a)). Each user $u \in \mathcal{U}$ makes a request to a file
$f_u \in \mathcal{F} = \{f_1, \ldots, f_m\}$ in an i.i.d. manner, according to
a given request probability mass function $P_r(f)$. In order
to model the asynchronism of video on demand and forbid
any form of "for-free" multicasting gain by "overhearing"
transmissions dedicated to others, we assume that each file in
the library is formed by $L$ "chunks". For example, in current
video streaming protocols such as DASH [3], the video file
is split into segments which are sequentially downloaded by
the streaming users. The chunk downloading time is equal
to the chunk playback time, but chunks may correspond to
different bit-rates, depending on the video coding quality.
Then, we assume that requests are strongly asynchronous:
each user downloads a segment of length $L'$ of a long file
of $L$ chunks. We measure the cache size in files, and let
first $L \to \infty$ and then study the system scaling laws for
$n, m \to \infty$, with fixed $L' < \infty$. Hence, the probability of
useful overhearing vanishes, while the probability that two
users request the same file depends on the library size $m$ and
on the request distribution $P_r$. In short, this is a conceptual
way to decouple the overlap of the demands from the overlap
of concurrent transmissions, which would be difficult if not
impossible to exploit in a practical system. For the sake of
simplicity, we assume that the user caches contain $M = 1$
files ($ML$ chunks) in the analysis. Fig. 2 shows qualitatively
our model assumptions.

Definition 1: (Protocol model) If a node $i$ transmits a packet
to node $j$, then the transmission is successful if and only if

• The distance between $i$ and $j$ is less than $R$:

$$d(i,j) < R. \qquad (1)$$

• For any other node $k$ that is transmitting simultaneously,

$$d(k,j) > (1+\Delta) R. \qquad (2)$$

$R$ is the transmission range and $\Delta > 0$ is an interference
control parameter. Nodes send data at a constant rate of $C$
bit/s/Hz during a successful transmission.
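
As an illustration, the two conditions of the protocol model can be checked directly from node positions. The following is a minimal sketch, assuming nodes are given as planar coordinates and using strict inequalities as in (1) and (2); the function name `link_succeeds` and its arguments are hypothetical and not part of the paper.

```python
import math


def link_succeeds(tx, rx, other_txs, R, delta):
    """Check the protocol model for one link i -> j.

    tx, rx    : (x, y) positions of transmitter i and receiver j
    other_txs : positions of all other simultaneously transmitting nodes k
    R         : common transmission range
    delta     : interference control parameter (delta > 0)
    """
    # Condition (1): the receiver must be within range of its transmitter.
    if not math.dist(tx, rx) < R:
        return False
    # Condition (2): every other active transmitter must lie outside the
    # guard disk of radius (1 + delta) * R centered at the receiver.
    guard = (1 + delta) * R
    return all(math.dist(k, rx) > guard for k in other_txs)
```

For example, with $R = 0.2$ and $\Delta = 0.5$, a receiver at distance 0.1 from its transmitter is served as long as no other active transmitter is within 0.3 of it.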

In our model we do not consider power control (which
would allow different transmit powers, and thus transmission
ranges, for each user). Rather, we treat $R$ as a design parameter
that can be set as a function of $m$ and $n$, but which cannot
vary between users.

Definition 2: (Network) A network is formed by a set of
user nodes $\mathcal{U} = \{u_1, \ldots, u_n\}$, a set of helper nodes
$\mathcal{H} = \{h_1, \ldots, h_r\}$ and a set of files $\mathcal{F} = \{f_1, \ldots, f_m\}$. Nodes in $\mathcal{U}$
and $\mathcal{H}$ are placed in a two-dimensional unit-square region, and
their transmissions obey the protocol model. Helper nodes are
only transmitters; user nodes can be transmitters and receivers.
In general, all $n(n-1)$ directed links between all user nodes
and all $rn$ directed links between the helper nodes and the user
nodes, together with the protocol model, define an interference
(conflict) graph. Only the links in an independent set in the
interference graph can be active simultaneously.



Definition 3: (Cache placement) The cache placement $\Pi_c$
is a rule to assign files from the library $\mathcal{F}$ to the user nodes $\mathcal{U}$
and the helper nodes $\mathcal{H}$, with "replacement" (i.e., with possible
replication). Let $G = \{\mathcal{U} \cup \mathcal{H}, \mathcal{F}, \mathcal{E}\}$ be a bipartite graph with
"left" nodes $\mathcal{U} \cup \mathcal{H}$, "right" nodes $\mathcal{F}$ and edges $\mathcal{E}$ such that
$(u_i, f_j) \in \mathcal{E}$ indicates that file $f_j$ is assigned to the cache of
user node $u_i$ and $(h_i, f_j) \in \mathcal{E}$ indicates that file $f_j$ is assigned
to the cache of helper node $h_i$. A bipartite cache placement
graph $G$ is feasible if the degree of each left node (user or
helper) is not larger than its maximum cache capacity $M$. Let
$\mathcal{G}$ denote the set of all feasible bipartite graphs $G$. Then, $\Pi_c$
is a probability mass function over $\mathcal{G}$, i.e., a particular cache
placement $G \in \mathcal{G}$ is assigned with probability $\Pi_c(G)$.

Notice that deterministic cache placements are special cases,
corresponding to deterministic probability mass functions, i.e., a
single probability mass equal to 1 on the desired $G$. In
contrast, we will be interested in "decentralized" random
caching placements with no helpers, constructed as follows:
each user node $u_i$ selects its cache content in an i.i.d. manner,
by independently generating $M = 1$ random file indices with
the same caching probability mass function $\{P_c(f) : f \in \mathcal{F}\}$.
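
The decentralized placement described above amounts to one independent draw per user. The sketch below illustrates this for $M = 1$, assuming files are indexed $1, \ldots, m$ and the caching probability mass function is passed as a list of weights; the function name `random_cache_placement` and the `seed` argument are hypothetical conveniences.

```python
import random


def random_cache_placement(n, caching_pmf, seed=None):
    """Draw one cached file index per user, i.i.d. from caching_pmf (M = 1).

    n           : number of user nodes
    caching_pmf : list [P_c(1), ..., P_c(m)] of non-negative weights
    seed        : optional seed for reproducibility (illustrative only)
    Returns a list whose i-th entry is the file cached by user u_i.
    """
    rng = random.Random(seed)
    files = list(range(1, len(caching_pmf) + 1))
    # random.choices samples with replacement, so different users may
    # cache the same file, as allowed by the definition.
    return rng.choices(files, weights=caching_pmf, k=n)
```

A deterministic placement is recovered as the special case where the weights put all mass on a single file.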

Definition 4: (Random requests) At each request time
(integer multiples of some fixed (large) integer $L'$), each user
$u \in \mathcal{U}$ makes a request to a segment of length $L'$ of chunks
from file $f_u \in \mathcal{F}$, selected independently as $f_u \sim P_r$. The set
of current requests $\mathbf{f} = (f_{u_1}, \ldots, f_{u_n})$ is therefore a random
vector taking on values in $\mathcal{F}^n$, with product joint probability
mass function $P(\mathbf{f} = (f_{u_1}, \ldots, f_{u_n})) = \prod_{i=1}^{n} P_r(f_{u_i})$.

In this paper, we assume $P_r(f)$ follows a Zipf distribution
with parameter $0 < \gamma_r < 1$, i.e., any node requests file $f$ with
probability $P_r(f) = \frac{f^{-\gamma_r}}{H(\gamma_r, 1, m)}$, where $H(\gamma, a, b) = \sum_{i=a}^{b} i^{-\gamma}$
and $f = 1, \ldots, m$.
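
For concreteness, the request distribution can be tabulated numerically. This is a minimal sketch assuming the standard Zipf form in which the probability of file $f$ is proportional to $f^{-\gamma_r}$ and normalized over $f = 1, \ldots, m$; the function name `zipf_pmf` is hypothetical.

```python
def zipf_pmf(m, gamma_r):
    """Return [P_r(1), ..., P_r(m)] for a Zipf law with parameter gamma_r.

    Each weight is f ** (-gamma_r); the normalizer is their sum over
    f = 1, ..., m (the quantity written H(gamma_r, 1, m) in the text).
    """
    weights = [f ** (-gamma_r) for f in range(1, m + 1)]
    normalizer = sum(weights)
    return [w / normalizer for w in weights]
```

With `zipf_pmf(1000, 0.5)`, the most popular file is requested twice as often as the fourth most popular one, since $4^{0.5} = 2$.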

Definition 5: (Transmission policy) The transmission policy
$\Pi_t$ is a rule to activate the D2D links in the network. Let
$\mathcal{L}$ denote the set of all directed links. Let $\mathcal{A} \subseteq 2^{\mathcal{L}}$ be the set
of all possible feasible subsets of links (this is a subset of
the power set of $\mathcal{L}$, formed by all sets of links corresponding
to independent sets in the network interference graph). Let
$A \subset \mathcal{L}$ denote a feasible set of simultaneously active links
according to the protocol model. Then, $\Pi_t$ is a conditional
probability mass function over $\mathcal{A}$ given $\mathbf{f}$ (requests) and $G$
(cache placement), assigning probability $\Pi_t(A | \mathbf{f}, G)$ to $A \in \mathcal{A}$.


We may think of $\Pi_t$ as a way of scheduling simultaneously
compatible sets of links (subject to the protocol model). The 
scheduling slot duration is generally much shorter than the 
chunk playback duration. Invoking a time-scale decompo- 
sition, and provided that enough buffering is used at the 
receiving end, we can always match the average throughput 
(expressed in information bit/s) per user with the average 
source coding rate at which the video file can be streamed 
to a given user. Hence, while the chunk delivery time is fixed 
(e.g., one chunk per 0.5 seconds) the "quality" at which the 
video is streamed and reproduced at the user end depends 
on the user average throughput. Therefore, in this scenario 
we are concerned with the ergodic (i.e., long-term average) 



throughput per user. 

Definition 6: (Useful received bits per slot) For given $P_r$,
$\Pi_c$ and $\Pi_t$, and user $u \in \mathcal{U}$, we define the random variable $T_u$
as the number of useful received information bits per slot unit
time by user $u$ at a given scheduling time (irrelevant because
of stationarity). This is given by

$$T_u = \sum_{v : (u,v) \in A} c_{u,v} \, 1\{f_u \in G(v)\} \qquad (3)$$

where $f_u$ denotes the file requested by user node $u$, $c_{u,v}$
denotes the rate of the link $(u,v)$, and $G(v)$ denotes the content
of the cache of node $v$, i.e., the neighborhood of left node $v$
in the cache placement graph $G$.

Consistently with the protocol model, $c_{u,v}$ depends only on
the active link $(u,v) \in A$ and not on the whole set of active
links $A$. Furthermore, we shall obtain most of our results under
the simplifying assumption (usually made under the protocol
model) that $c_{u,v} = C$ for all $(u,v) \in A$. The indicator function
$1\{f_u \in G(v)\}$ expresses the fact that only the bits relative to
the file $f_u$ requested by user $u$ are "useful" and count towards
the throughput. It is obvious that scheduling links $(u,v)$ for
which $f_u \notin G(v)$ is useless for the sake of the throughput
defined as above. Hence, we could restrict our transmission
policies to those activating only links $(u,v)$ for which $f_u \in G(v)$.
These links are referred to as "potential links", i.e., links
potentially carrying useful data. Potential links included in $A$
are "active links", at the given scheduling slot.

The average throughput for user node $u \in \mathcal{U}$ is given by
$\overline{T}_u = \mathbb{E}[T_u]$, where expectation is with respect to the random
triple $(\mathbf{f}, G, A) \sim \prod_{i=1}^{n} P_r(f_{u_i}) \Pi_c(G) \Pi_t(A | \mathbf{f}, G)$. Next, we
define the condition of "user in outage" consistently with
the qualitative system description given before. In particular,
consider a user $u$ and its useful received bits per slot $T_u$. We
say that user $u$ is in outage if $\mathbb{E}[T_u | \mathbf{f}, G] = 0$. This condition
captures the event that no link $(u,v)$ with $f_u \in G(v)$ is
scheduled with positive probability, for a given set of requests
$\mathbf{f}$ and cache placement $G$. In other words, a user $u$ for which
$\mathbb{E}[T_u | \mathbf{f}, G] = 0$ experiences a "long" lack of service (zero rate),
as long as the cache placement is $G$ and the request vector is $\mathbf{f}$.

Definition 7: (Number of nodes in outage) The number of
nodes in outage is given by

$$N_o = \sum_{u \in \mathcal{U}} 1\{\mathbb{E}[T_u | \mathbf{f}, G] = 0\}. \qquad (4)$$

Notice that $N_o$ is a random variable, function of $\mathbf{f}$ and $G$.

Definition 8: (Average outage probability) The average
(across the users) outage probability is given by

$$p_o = \frac{1}{n} \mathbb{E}[N_o] = \frac{1}{n} \sum_{u \in \mathcal{U}} P\left(\mathbb{E}[T_u | \mathbf{f}, G] = 0\right). \qquad (5)$$



Here, we focus on max-min fairness, i.e., we express the
outage-throughput tradeoff in terms of the minimum average
user throughput, defined as

$$\overline{T}_{\min} = \min_{u \in \mathcal{U}} \{\overline{T}_u\}. \qquad (6)$$



At this point we can define the performance tradeoffs that we 
wish to characterize in this work: 

Definition 9: (Outage-Throughput Tradeoff) For a
given network and request probability distribution $P_r$, an
outage-throughput pair $(p, t)$ is achievable if there exists
a cache placement $\Pi_c$ and a transmission policy $\Pi_t$ with
outage probability $p_o \le p$ and minimum per-user average
throughput $\overline{T}_{\min} \ge t$. The outage-throughput achievable
region $\mathcal{T}(P_r, n, m)$ is the closure of all achievable outage-throughput
pairs $(p, t)$. In particular, we let $T^*(p) = \sup\{t : (p, t) \in \mathcal{T}(P_r, n, m)\}$.
Notice that $T^*(p)$ is the result of the following optimization
problem:

$$\begin{array}{ll} \text{maximize} & \overline{T}_{\min} \\ \text{subject to} & p_o \le p, \end{array} \qquad (7)$$



where the maximization is with respect to the cache placement
and transmission policies $\Pi_c, \Pi_t$. Hence, it is immediate to
see that $T^*(p)$ is non-decreasing in the range of feasible
outage probability, which in general is the interval $[p_{o,\min}, 1]$
for some $p_{o,\min} \ge 0$. Whether $p_{o,\min}$ is equal to 0 or
strictly positive depends on the model assumptions. We say
that an achievable point $(p, t)$ dominates an achievable
point $(p', t')$ if $p \le p'$ and $t \ge t'$, where at least one of
the inequalities is strict. As usual, the Pareto boundary of
$\mathcal{T}(P_r, n, m)$ consists of all achievable points that are not
dominated by other achievable points.
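
The dominance relation is easy to apply to a finite collection of (outage, throughput) pairs, for instance points obtained from simulation. The sketch below filters such a collection down to its non-dominated points; the function names `dominates` and `pareto_boundary` are hypothetical, and a finite list is only a stand-in for the full achievable region.

```python
def dominates(a, b):
    """True if point a = (p, t) dominates point b = (p', t').

    a dominates b when p <= p' and t >= t', with at least one of the
    two inequalities strict (lower outage and/or higher throughput).
    """
    (p, t), (p2, t2) = a, b
    return p <= p2 and t >= t2 and (p < p2 or t > t2)


def pareto_boundary(points):
    """Keep only the points that no other point in the list dominates."""
    return [a for a in points if not any(dominates(b, a) for b in points)]
```

For instance, among the points (0.1, 1.0), (0.2, 2.0), (0.2, 1.5) and (0.3, 2.0), only the first two survive.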

III. Achievable Outage-Throughput Trade-off 

We obtain an inner bound on the achievable
throughput-outage tradeoff by considering a specific transmission
policy based on clustering and independent random
caching.

Clustering: the network is divided into clusters of equal
size, denoted by $g_c(m)$ and independent of the users' demands
and cache placement realizations. A user can only look for the
requested file inside the corresponding cluster. If a user can
find the requested file inside the cluster, we say there is one
potential link in this cluster. Moreover, if a cluster contains at
least one potential link, we say that this cluster is good. We
use an interference avoidance scheme for which at most one
transmission is allowed in each cluster, on any time-frequency
slot (transmission resource). Potential links inside the same
cluster are scheduled with equal probability (or, equivalently,
in round robin), such that all users have the same throughput
$\overline{T}_u = \overline{T}_{\min}$. To avoid interference between clusters, we use
a time-frequency reuse scheme [7, Ch. 17] with parameter
$K$ as shown in Fig. 1(b). In particular, we can pick
$K = \left(\lceil \sqrt{2}(1+\Delta) \rceil + 1\right)^2$.
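
As a quick numerical illustration of the reuse parameter, the sketch below evaluates $K$ for a given interference parameter. It assumes that the choice of $K$ stated above has the form $K = (\lceil \sqrt{2}(1+\Delta) \rceil + 1)^2$; this form is an assumption of the sketch, and the function name `reuse_factor` is hypothetical.

```python
import math


def reuse_factor(delta):
    """Reuse parameter K for interference parameter delta (> 0).

    Assumes K = (ceil(sqrt(2) * (1 + delta)) + 1) ** 2, i.e. clusters are
    grouped into square blocks and one cluster per block is active at a time.
    """
    return (math.ceil(math.sqrt(2) * (1 + delta)) + 1) ** 2
```

Under this assumption, any $\Delta$ with $\sqrt{2}(1+\Delta) \le 2$ gives $K = 9$, which matches the layout of Fig. 1(b).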

Random Caching: each node randomly and independently
caches one file according to a common probability distribution
function $P_c$. We shall find the optimal $P_c$ that maximizes the
achievable $\overline{T}_{\min}$ under the clustering scheme.
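
To make the interplay between clustering and random caching concrete, the following sketch estimates the probability that a user finds no copy of its requested file among the other nodes of its cluster. It assumes $M = 1$, independence between requests and caches, and that a user's own cache does not count as a source; this last point is an assumption of the sketch. The closed form and the Monte Carlo estimate are both illustrative, and all function names are hypothetical.

```python
import random


def cluster_outage_closed_form(request_pmf, caching_pmf, g):
    """P(none of the g - 1 other cluster nodes caches the requested file).

    Averages (1 - P_c(f)) ** (g - 1) over the request distribution P_r.
    """
    return sum(pr * (1.0 - pc) ** (g - 1)
               for pr, pc in zip(request_pmf, caching_pmf))


def cluster_outage_monte_carlo(request_pmf, caching_pmf, g,
                               trials=20000, seed=0):
    """Empirical estimate of the same probability by direct simulation."""
    rng = random.Random(seed)
    files = range(len(request_pmf))
    misses = 0
    for _ in range(trials):
        # One request and g - 1 independent single-file caches per trial.
        wanted = rng.choices(files, weights=request_pmf)[0]
        neighbours = rng.choices(files, weights=caching_pmf, k=g - 1)
        if wanted not in neighbours:
            misses += 1
    return misses / trials
```

With four equally popular files cached uniformly and clusters of three nodes, the closed form gives $0.75^2 = 0.5625$, and the simulation should land close to that value.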

In the rest of this paper, unless stated otherwise, it is assumed
that $n, m \to \infty$ in some way (to be specified later). Proofs
are omitted due to space limitations, and are provided
in [8]. We start by characterizing the optimal random caching
distribution under the clustering transmission scheme.

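Theorem 1: Under the model assumptions and the clustering
scheme, the optimal caching distribution $P_c^*$ that maximizes
the probability $p^c$ that any user $u \in \mathcal{U}$ finds its requested file
inside its corresponding cluster is given by

$$P_c^*(f_i) = \left[ 1 - \frac{\nu}{z_i} \right]^+, \quad i = 1, \ldots, m, \qquad (8)$$

where $\nu = \frac{m^* - 1}{\sum_{j=1}^{m^*} \frac{1}{z_j}}$, $z_j = P_r(f_j)^{\frac{1}{g_c(m) - 2}}$, and $m^* = \Theta\left( \min\left\{ \frac{g_c(m)}{\gamma_r}, m \right\} \right)$. □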
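
A distribution of this water-filling type can be computed numerically. The sketch below assumes $M = 1$ and the form $P_c^*(f_i) = [1 - \nu / z_i]^+$ with $z_i = P_r(f_i)^{1/(g_c(m)-2)}$ and $\nu = (m^* - 1) / \sum_{j \le m^*} 1/z_j$; these expressions are assumptions of the sketch. It further assumes strictly positive request probabilities sorted in non-increasing order, and it finds $m^*$ by a linear search for the largest prefix of files that all receive positive mass, which is one possible way to determine it. The function name `optimal_caching_pmf` is hypothetical.

```python
def optimal_caching_pmf(request_pmf, g):
    """Water-filling caching distribution for cluster size g (M = 1).

    request_pmf : [P_r(f_1), ..., P_r(f_m)], positive and non-increasing
    g           : cluster size g_c(m), must exceed 2
    Returns [P_c(f_1), ..., P_c(f_m)].
    """
    if g <= 2:
        raise ValueError("cluster size g must exceed 2")
    z = [pr ** (1.0 / (g - 2)) for pr in request_pmf]
    inv_sum = 0.0  # running sum of 1 / z_j over the accepted files
    nu = 0.0       # water level for the accepted prefix
    for k, zk in enumerate(z, start=1):
        inv_sum_k = inv_sum + 1.0 / zk
        nu_k = (k - 1) / inv_sum_k
        if nu_k >= zk:
            # File k would receive non-positive mass: the prefix ends here.
            break
        inv_sum, nu = inv_sum_k, nu_k
    return [max(0.0, 1.0 - nu / zf) for zf in z]
```

For a uniform request distribution the result is uniform as well, while for a Zipf request distribution only the most popular files receive positive caching probability.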
Next, we distinguish the different regimes of small library 
size, large library size and very large library size. Letting m 
vary as a function of n, we have 

m 

- 0, small library 



lim 

m 

< const. < lim — < 

n^oo 

^ m 
lim — > 



2^ 

l-7r 

Ir 



(9) 

large library (10) 

very large library 
(11) 



where we define $\alpha = \frac{2 - \gamma_r}{1 - \gamma_r}$. Then, we have:



Theorem 2: In the small library regime, the
outage-throughput trade-off achievable by random caching and
the clustering scheme behaves as:



T*(p) > 



K pim -'-V / ' 
CA 1 



m{l—p) l-Tr 



^m-i/« + (54(m), 



l-7r 



1 



p <l — a{^r)^ ^^^1 
p>l — a(7^)m~-^/" 



(12) 



where a (7^) 



7r 



D 



, A = 7^1-^-, B = 
and where pi and p2 are 



l+a(7.)y^ 

positive parameters satisfying pi > 7^ and p2 > \J^^^' 
The cluster size gd'm) is any function of m satisfying 
gd'm) = ^ (m-^/^) and gd'm) < 7r^. The functions Si{m) 
i = 1,2,3,4 are vanishing for m ^ 00 with the following 

orders Si{m) = o(l/m), S2{m) = o I ^ 1 ), Ss{m), 

\m{l-p) i-7r / 

(54(m) = o(m-i/"). □ 
The results for the large and very large library regimes can
be found in [8].

IV. Outer Bound 

Under the assumptions of the protocol model (see Definition 1)
and one-hop transmission, we can provide an outer bound 



on the outage-throughput tradeoff $(p, T^{ub}(p))$ such that the
ensemble of such points for $p \in [0, 1]$ dominates the optimal
trade-off, i.e., the ensemble of solutions of (7). We have:

Theorem 3: In the small library regime, the set of points de- 
fined below dominates the optimal throughput-outage tradeoff: 

+ 6,{m), p=l- ('-^Y"'^ , 

min/ 

/i(/>3)m-i/-} + Se{m), 1 - ps^-^^m-^/^ < 

p < 1 — p4}~^''m~^^^ , 
fi{p4)m-^/'^ + drim), 1 - p4^"^^m-i/" < P < 1, 

(13) 

where ps is a positive parameter and p^ is the solution of the 
equation 

2 \2-7. 




log 1 + (2 - 7,) 1 + 



3A 



2-7r 



with respect to p, gnim) is any function 

gnim) uj (m^/^^) and gnim) < ^n, 

16C f, ^1 , 3A A 2(2-7.) ^2-7.^^^ ^^d 

= o(m-i/"). 




, (14) 
such that 

hip) = 

_ _ . _ . □ 

^ m{l—p) ^ 

The results of other regimes of $m$ can be found in [8]. In
all cases, notice that the scaling laws of the throughput and
outage probability with respect to $m \to \infty$ coincide and are
therefore tight up to some gap in the constants of the leading
terms.

V. Discussion 

In this section, we focus on the regime of small library as
provided in Theorem 2. Specifically, we consider the regime
of constant outage probability constraint ($0 < p < 1$ and
$g_c(m) \propto m$). We realistically assume that $m = 1000$ and
$n = 10000$ (this corresponds to one node every 10 × 10 m,
in a 1 km² area). Moreover, we let $K = 4$ and the link rate
for the D2D be 10 Mb/s. The simulation of the throughput
per user is shown in Fig. 3. This simulation shows that even
for practical $m$ and $n$, the dominant term in (12) accurately
captures the system behavior.

In [2], by using a sub-packetization based caching and a
coded multicasting scheme, the minimum per user throughput
scales as $\Theta\left(\max\left\{\frac{1}{m}, \frac{1}{n}\right\}\right)$ and this scheme can achieve a
zero outage probability. When $n > m$, it has the same order
as the minimum per user throughput with a constant outage
probability by using our scheme. However, if we allow a small
outage, our scheme can provide a large constant gain in terms
of minimum per user throughput. For example, if we assume
$m = 1000$ and $n = 10000$, and pick a realistic $\gamma_r$ as 0.5 [9]



Fig. 3. Comparison between the theoretical result
and the simulated result in terms of the minimum throughput per user vs. outage
probability constraint. We assume $m = 1000$, $n = 10000$, $K = 4$ and the
link rate for the D2D is 10 Mb/s. The parameter $\gamma_r$ for the Zipf distribution
varies from 0.1 to 0.6. The theoretical curve is the plot of the dominant term
in (12).



and allow about 0.3 outage probability, then we can achieve a
minimum per user throughput of 2.5 Kb/s by using the one-hop
D2D scheme. For comparison, assuming that the cellular base
station has a realistic common multicasting rate of 500 Kb/s,
and using the coded caching scheme of [2], the common (and
therefore minimum) per user throughput is 0.5506 Kb/s. In this
case, the D2D random caching scheme yields a 5× gain. With
naive broadcasting from the cellular base station (today's state
of the art), the minimum per user throughput is 0.05 Kb/s,
such that our scheme yields a 50× gain.
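
The two baseline figures quoted above can be reproduced with a short calculation. The sketch below assumes that the coded caching scheme of [2] needs a delivery load of $n(1 - M/m)\min\{1/(1 + nM/m), m/n\}$ file units to serve all $n$ users, and that naive broadcasting serves the $n$ users one after another at the common multicasting rate; the load expression is taken from [2] rather than from the text above, and the function names are hypothetical.

```python
def coded_multicast_per_user_rate(n, m, multicast_rate, M=1):
    """Per-user rate under the coded caching scheme of [2].

    n, m           : number of users and library size
    multicast_rate : common multicasting rate of the base station
    M              : cache size in files
    Assumes the delivery load n(1 - M/m) * min(1/(1 + nM/m), m/n).
    """
    load = n * (1 - M / m) * min(1 / (1 + n * M / m), m / n)
    return multicast_rate / load


def naive_broadcast_per_user_rate(n, multicast_rate):
    """Per-user rate when n independent messages share the downlink."""
    return multicast_rate / n
```

With $n = 10000$, $m = 1000$ and a rate of 500 Kb/s, these give about 0.5506 Kb/s and 0.05 Kb/s respectively, so that 2.5 Kb/s corresponds to gains of roughly 4.5 (rounded to 5× in the text) and 50.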

References 

[1] N. Golrezaei, A.F. Molisch, and A.G. Dimakis, "Base station assisted
device-to-device communications for high-throughput wireless video
networks," IEEE Communications Magazine, in press, 2012.

[2] M.A. Maddah-Ali and U. Niesen, "Fundamental limits of caching," arXiv
preprint arXiv:1209.5807, 2012.

[3] X. Wu, S. Tavildar, S. Shakkottai, T. Richardson, J. Li, R. Laroia, and
A. Jovicic, "FlashLinQ: A synchronous distributed scheduler for
peer-to-peer ad hoc networks," in Communication, Control, and Computing
(Allerton), 2010 48th Annual Allerton Conference on. IEEE, 2010, pp.
514-521.

[4] S. Gitzenis, G.S. Paschos, and L. Tassiulas, "Asymptotic laws for joint
content replication and delivery in wireless networks," arXiv preprint
arXiv:1201.3095, 2012.

[5] P. Gupta and P.R. Kumar, "The capacity of wireless networks,"
Information Theory, IEEE Transactions on, vol. 46, no. 2, pp. 388-404, 2000.

[6] N. Golrezaei, A.G. Dimakis, and A.F. Molisch, "Wireless
device-to-device communications with distributed caching," in Information Theory
Proceedings (ISIT), 2012 IEEE International Symposium on. IEEE, 2012,
pp. 2781-2785.

[7] A.F. Molisch, Wireless Communications, John Wiley & Sons, 2011.

[8] M. Ji, G. Caire, and A.F. Molisch, "Optimal throughput-outage trade-off
in wireless device-to-device caching networks," in preparation.

[9] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, "Web caching
and Zipf-like distributions: Evidence and implications," in INFOCOM'99.
Eighteenth Annual Joint Conference of the IEEE Computer and
Communications Societies. Proceedings. IEEE, 1999, vol. 1, pp. 126-134.



